Resources for new research directions in speaker recognition: the mixer 3, 4 and 5 corpora
نویسندگان
چکیده
This paper describes new language resources designed to support research in speaker recognition. It begins with a brief overview of collections protocols, motivates the shift from the Switchboard protocol to the Mixer protocol, summarizes yields from the earliest phase of Mixer collection and then describes more recent phases, yields and expected yields and lessons learned.
منابع مشابه
Speaker Recognition: Building the Mixer 4 and 5 Corpora
The original Mixer corpus was designed to satisfy developing commercial and forensic needs. The resulting Mixer corpora, Phases 1 through 5, have evolved to support and increasing variety of research tasks, including multilingual and cross-channel recognition. The Mixer Phases 4 and 5 corpora feature a wider variety of channels and greater variation in the situations under which the speech is r...
متن کاملThe Mixer and Transcript Reading Corpora: Resources for Multilingual, Crosschannel Speaker Recognition Research
This paper describes the planning and creation of the Mixer and Transcript Reading corpora, their properties and yields, and reports on the lessons learned during their development.
متن کاملNew release of Mixer-6: Improved validity for phonetic study of speaker variation and identification
The Mixer series of speech corpora were collected over several years, principally to support annual NIST evaluations of speaker recognition (SR) technologies. These evaluations focused on conversational speech over a variety of channels and recording conditions. One of the series, Mixer-6, added a new condition, read speech, to support basic scientific research on speaker characteristics, as we...
متن کاملThe Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data
This paper describes efforts to create corpora to support and evaluate systems that perform speaker recognition where channel and language may vary. Beyond the ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and cross channel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium and the research ongoing at the US ...
متن کاملThe NIST 2010 speaker recognition evaluation
The 2010 NIST Speaker Recognition Evaluation continues a series of evaluations of text independent speaker detection begun in 1996. It utilizes the newly collected Mixer-6 and Greybeard Corpora from the Linguistic Data Consortium. Major test conditions to be examined include variations in channel, speech style, vocal effort, and the effect of speaker aging over a multi-year period. A new primar...
متن کامل